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ABSTRACT 

Motivation: Time-evolving differential protein-protein interaction (PPI) 
networks are essential to understand serial activation of differentially 
regulated (up- or down regulated) cellular processes (DRPs) and their 
interplays over time. Despite developments in the network inference, 
current methods are still limited in identifying temporal transition 
of structures of PPI networks, DRPs associated with the structural 
transition and the interplays among the DRPs over time. 
Results: Here, we present a probabilistic model for estimating Time- 
Evolving differential PPI networks with MultiPle Information (TEMPI). 
This model describes probabilistic relationships among network 
structures, time-course gene expression data and Gene Ontology bio- 
logical processes (GOBPs). By maximizing the likelihood of the prob- 
abilistic model, TEMPI estimates jointly the time-evolving differential 
PPI networks (TDNs) describing temporal transition of PPI network 
structures together with serial activation of DRPs associated with tran- 
siting networks. This joint estimation enables us to interpret the TDNs 
in terms of temporal transition of the DRPs. To demonstrate the utility 
of TEMPI, we applied it to two time-course datasets. TEMPI identified 
the TDNs that correctly delineated temporal transition of DRPs and 
time-dependent associations between the DRPs. These TDNs provide 
hypotheses for mechanisms underlying serial activation of key DRPs 
and their temporal associations. 

Availability and implementation: Source code and sample data files 
are available at http://sbm.postech.ac.kr/tempi/sources.zip. 
Contact: seungjin@postech.ac.kr or dhwang@dgist.ac.kr 
Supplementary information: Supplementary data are available at 
Bioinformatics online. 

1 INTRODUCTION 

Many cellular events involve serial activation of cellular pro- 
cesses during which genes/proteins associated with the processes 
are up- or downregulated. Differential protein-protein inter- 
action (PPI) networks (DNs) have been used to delineate PPIs 
(edges) among differentially regulated nodes (DRNs), such as 
up- or downregulated genes or proteins. The DNs have been 
considered more effective for understanding the differences be- 
tween two conditions, compared with non-DNs (de la Fuente, 
2010). However, the DNs delineate no temporal transition of the 
DRNs and/or their edges to represent serial activation of cellular 
processes over time. Thus, time-evolving differential PPI net- 
works (TDNs) have been introduced to delineate (i) temporal 
changes in abundances or activities of DRNs (node transition). 
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and/or (ii) formation of new edges for the DRNs and disappear- 
ance of existing edges over time (edge transition). The TDNs are 
essential to understand serial activation of differentially regu- 
lated cellular processes (DRPs) during a cellular event and 
their underlying mechanisms. 

Time-course gene expression analysis can provide temporal 
changes in abundances of the DRNs (Hwang et aL, 2009). 
Several interaction assays, such as yeast two-hybrid (Ito et aL, 
2001; Uetz et aL, 2000; Yu et aL, 2008) and mass spectrometry- 
based tandem affinity purification (CoUins et aL, 2007) can be 
used to measure PPIs among the DRNs. However, it is still 
challenging to experimentally identify temporal transition of 
the edges among the DRNs because of the limited coverage of 
the interactomes detected by these assays (von Mering et aL, 
2002). 

The limitation of the experimental methods prompted us to 
develop a computational method to estimate TDNs. Many meth- 
ods for estimation of dynamic gene regulatory networks have 
been developed (Kim et aL, 2014). However, estimation of tem- 
poral transitions of differential PPI networks (i.e. TDNs) has 
been rarely studied. A couple of methods have been developed 
to identify differential PPI networks, which can be then used to 
estimate TDNs. First, a simple method to infer DNs using time- 
course gene expression data identifies DRNs over time and con- 
structs a template PPI network with the known PPIs among all 
the DRNs (Hwang et aL, 2009; Przytycka and Kim, 2010). 
TDNs can be then constructed by selecting the interacting 
DRNs with significant expression changes at each t from the 
template PPI network. Second, principal network analysis 
(PNA) identifies differential expression patterns over time and 
then selects DRNs and their edges (known PPIs between the 
DRNs) showing the differential expression patterns (Kim et aL, 
2011). A principal subnetwork (PS) is then constructed using 
both DRNs and edges selected for each differential expression 
pattern. Finally, TDNs can be constructed by selecting the edges 
in PSs for which the linked DRNs show significant expression 
changes at each t. 

Functional interpretation of the inferred TDNs is important to 
understand temporal transition of the DRPs. In most of the 
network inference methods, it is commonly performed independ- 
ently from network inference using post hoc analyses of Gene 
Ontology biological processes (GOBPs) of the nodes in the 
inferred networks (Kim et aL, 2014). For example, the method 
proposed by Park and Bader (2012) clusters the nodes in time- 
evolving networks based on the similarity of temporal transitions 
of their edges and then links these clusters to cellular functions 
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using GOBPs. However, none of the methods estimating DNs or 
TDNs integrates functional information, such as GOBPs, during 
the network inference such that the inferred TDNs can represent 
directly temporal transition of differentially regulated GOBPs 
and time-dependent interplays between the GOBPs, thereby 
facilitating functional interpretation of the TDNs. 

Here, we introduce a probabiHstic model for estimating Time- 
Evolving differential PPI networks with MultiPle Information 
(TEMPI). Although many methods have used probabilistic mod- 
eling for estimating network structures (Friedman et ai, 2000; 
Ong et ai, 2002; Song et al., 2009), a unique aspect of our model 
is that it models additionally probabilistic dependencies of 
GOBPs with network structures and time-course global data. 
By maximizing the likelihood function of the probabilistic 
model, TEMPI jointly estimates the TDNs showing temporal 
transitions of network structures with temporal activation of 
the GOBPs and their temporal interplays. During the network 
inference, TEMPI infers edges not included in the known PPIs, 
whereas most of the previous methods (e.g. PNA) select a subset 
of PPIs for estimation of TDNs from the known PPIs. 



2 FRAMEWORK OF TEMPI 

TEMPI uses the observed data (time-course gene expression data, known 
PPIs and GOBPs) as the input variables, estimates the output variables 
based on the probabilistic graphical model describing probabilistic depen- 
dencies among the input and output variables, and then infers TDNs 
using the estimated output variables. First, TEMPI uses the following 
three observed variables as the inputs. As the first input, TEMPI uses the 
time-course gene expression log2-fold-changes (dynamic data) of n nodes 
at T time points, with R biological replicates (an n x T x R array E in 
Supplementary Fig. SI, bottom left). Estimation of the TDNs only using 
E can be an underdetermined problem (De Smet and Marchal, 2010). To 
reduce this issue, as the second input, TEMPI uses known PPI data 
(an n X n adjacency matrix ; static data) of n nodes (Supplementary 
Fig. SI, top left). In TEMPI, the are converted into positions of n 
nodes in a /^-dimensional latent space (an x p positional matrixX^) using 
multidimensional scaling (MDS; Supplementary Information Sl.l; 
Higham et al., 2008; You et al., 2010). In this study, we used the 2D 
latent space (Supplementary Information Sl.l). MDS locates the inter- 
acting nodes closely in the latent space: for example, for nodes A-E in G^ 
(Supplementary Fig. SI, top center), MDS located A-D closely, but E 
distantly from A-D in a 2D latent space. To identify TDNs, TEMPI then 
selects m DRNs from time-course gene expression data (Supplementary 
Information SI .2). TEMPI uses the m x p X' for the selected DRNs as an 
input. Finally, as the third input, TEMPI further uses the GOBP data (/ 
GOBP terms assigned to m DRNs; static data), a m x / binary node- 
GOBP matrix T in which tik = 1 when node / has GOBP k (e.g. a T 
for nodes A-F in Supplementary Fig. SI, top right). 

Second, TEMPI uses a probabilistic graphical model (Fig. 1) that de- 
scribes probabilistic dependencies (see Section 2.1) among the input 
observed variables {Xo = [T,E, X^}) and the following output hidden 
variables for m DRNs at each t : (i) 3. m x p positional matrix X^ repre- 
senting positions of n nodes in a /7-dimensional space at t, (ii) an m x m 
adjacency matrix G^ representing the presence of edges between m nodes 
at t and (iii) a m x / node-GOBP matrix representing differentially 
regulated GOBPs at in which a[j^ = 1 when GOBP k is estimated to be 
differentially regulated for node /; otherwise, a^j^^ = 0. The output hidden 
variables were then estimated through the optimization (see Section 2.2). 
Third, using the estimated outputs (Xh = {X\ G\ A^^), the TDNs can be 
constructed as described in Section 2.3 ('Inferred TDNs' of 
Supplementary Fig. SI, bottom right). The resulting TDNs have the 
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Fig. 1. The probabilistic model describes dependencies among the 
observed variables Xo = [T, E, X^] and the hidden variables 
X„ = {X',G',A'] 



following characteristics: (i) a pair of interacting DRNs according to G^ 
are likely to be linked when they share expression changes and GOBPs 
[e.g. the interacting A-B in G^ (Supplementary Fig. SI, top left), both of 
which were upregulated (Supplementary Fig. SI, bottom left) and share 
GOBPs 3 and 6 (Supplementary Fig. SI, top right), were linked in G^ at 
t= 1 (Supplementary Fig. SI, bottom right)]; (ii) the links between non- 
interacting DRNs according to G^ can be inferred when they share ex- 
pression changes and GOBPs (e.g. non-interacting B-D in G^, both of 
which were upregulated and share GOBPs 3 and 4, were linked in G^ at 
t = 3); and (iii) the GOBPs assigned to the interacting DRNs in T are 
likely to be differentially regulated (e.g. the interacting A-B at / = 1 have 
GOBPs 3 and 6 in T, which are co-differentially regulated in the esti- 
mated A^). 



2.1 Probabilistic graphical model 

The probabilistic graphical model (Fig. 1) was constructed to include the 
following dependencies between the input observed Xo = [T, E, X^^ and 
the output hidden variables X^ = [X^, G\ A^]: (i) X' depends on the initial 
positions of nodes (X^) and their positions at / — 1 (A^~^) to achieve 
smooth transition of TDNs over time; (ii) G\ a geometric graph, depends 
on the distances between the nodes in the latent space and thus on X^; 
(iii) E depends on the interactions (G"') between the nodes based on the 
observation in real PPI networks that the nodes with similar expression 
changes are likely to interact (Grigoriev, 2001; Supplementary Fig. S2A); 
and (iv) A^ depends on G^ and the node-GOBP matrix T, based on the 
observation in real PPI networks that the nodes with the same GOBPs 
are likely to interact (Sharan et al., 2007; Supplementary Fig. S2B). 

Based on these dependencies, the probabilistic model was defined by 
the following four submodels (see Supplementary Information SI. 3 for 
further details of the four submodels): 

• Transition model (P^''""-^) for m nodes is defined as a product of m 
Gaussian distributions: P^'"'"' = P{X'\X'-\ X^) =Y[';i^Af{x'.\ml,Xi) 
where ml = {x''^ /a^ ^ xj/(rj)/{l/aj ^ l/aj) and X, = [l/ {l/a^ + 
1 /crj)]I. and / are the positional vector for node / at t and the iden- 
tity matrix, respectively, at and aj control the penalties of displace- 
ment of nodes from positions (A^~^) at /-I and initial positions 
(X^), respectively. 

• Link model (p^'"^') is modeled as a product of Bernoulli distribu- 
tions for m(m- l)/2 pairs of m nodes: P^'^^ ^ P{G'\X') ^ Hvoj) 

''^1 — p\i^ '' where link probability {p\j) of nodes / and j is 
defined by p{g'y = 1 l^-) = A/'(4 |0, ct^/M{0 |0, Og) with 
dij=\\x\ — x^j\\2. p\j decreases as increases, and its decreasing 
rate is controlled by Og. 
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• Expression model {P^^p^) at t is defined as a weighted product of 
P{E^\G^) for all time points t, where is the m x R log2-fold- 
change matrix for m nodes, with R biological replicates at time t, 
and el is column r of P{el\G^) is then modeled as the product of 
the mixtures of Gamma distributions for all pairs of m nodes with 
two sets of parameters, (ke^, 6e^) and {ke^, ^^2)- 



(1-4) 



v(/,/) 

where<^.^ = l4. -eJJ + exp (^-{elf /c^ + Qxp (^-(4) /^)- ^ and 6' 
are determined to produce a higher probability in the first Gamma 
distribution than in the second one for a small d^j. Finally, we used a 
radial basis function kernel, k(T, /|v) = exp ((r — t)^/v), to weight P 
(E^IG^) in its weighted product (Song et al., 2009): 

T I R \ 



This weighting scheme ensures (i) smooth transitions of TDNs, and 
(ii) a more significant dependency of G^ on at t = / than other is 

0- 

• Ontology model (P^^) is factorized into P{A^\G^) and 
{A'\T):P^^ =P[A'\G\T)=P{A'\a)P{A'\T). First, P[A'\T) is 
defined by the product of Bernoulli distributions for / GOBPs of 
m nodes: 



^hQXQ f{tki)-{iff{k, T)/2-sy'^e^-''> {e = 1x10"^), given an inverse 
function frequency iff{k, 7) = log (m/J^i^ik) that penalizes general 
GOBPs. The normalization constant s was defined as the maximum 
value of iff(k, T). Second, P(A^\G^) was modeled as the mixture of 
Gamma distributions for all pairs of m nodes using two sets of 
parameters, {ko^,6o^) and {ko-^,9o^y. 



p{A'\G^) = Yl r(afa'^/h\ko,,eoy''r(al'^a'f/h\ko,,eo^ 



v(/V) 



where k and 9 are determined to produce a higher probability in the 
first Gamma function than in the second one for a large 
Many previous methods have used probabilistic models for estimating 

the network structures using E (Friedman et al., 2000; Ong et al., 2002; 

Song et al., 2009). However, a unique aspect of our model is that it 

includes the ontology model to integrate GOBP data (A^ and J) during 

the probabilistic estimation of TDNs. 



variational approximation (Beal, 2003) to estimate the variational distri- 
bution QiXfj), assuming the independency among the variables: 

Q{Xh) = Q{X%{G')Q{A^) where Q{r) =Yrj^,Af(x^.\fJil all), Q{G^) = 

n;Li5er(4.|t0, and Q{A') = Ul,Ui^,Ber{ai,Wi^). In these distribu- 
tions, fil, Gq, and are the variational parameters (Supplementary 
Information SI. 5). Finally, we determined the variational parameters 
{fjL\, and such that they maximize the lower bound of the marginal 
likelihood function as described in Supplementary Information SI. 5. 

2.3 Construction of TDNs 

Using these variational parameters {fjL\, and the output hidden 
variables {Xh = [X^ , G\ A^^) were finally estimated. At each first, the 
position (xj, column i of X^) of node i was determined as the expected 
value of the posterior probability of x\ given the observed variables, 
P{x\\Xo). The variational distribution with the estimated variational par- 
ameters approximates the posterior probability distribution of the hidden 
variables given the values of the observed variables: P{Xk\Xo) = Q{Xh). 
With the independency among the hidden variables, P(x\ \Xo) = Q(x]), 
P(g^j=l\Xo)^Q(glj = l) and P{ai=l\Xo)^Q{ai=l). Thus, x] was 
determined as ynf, the expected value of Q(x'j). Second, nodes / and J 
were determined to be linked (i.e. g]j of G^ = I) when 
P{gij=l\Xo) ^Q{gij=l)=^lj>0.5. Using the estimated and G\ 
TDNs can be constructed as geometric graphs at individual time points. 

Finally, for functional interpretation of the inferred TDNs, GOBP k 
was determined to be differentially regulated (positively or negatively 
activated) at / for node i (a[- of A' = 1) when the log-likelihood ratio 
of posterior and prior probabilities of a[- = 1, log [P[a[-= l\Xo)/P(a[- = 
1)] - log[eK, = l)/^(4/ = 1)] = log ^ii/Md, was significantly 
(P<0.01) larger than zero. The P- value was computed using one-tailed 
/-test (degree of freedom = the number of nodes with GOBP k). Using 
the estimated A', the activation degree of GOBP k at t was estimated as 
the fraction of the nodes with activated GOBP k (<3[. = 1) in the network 
at / among all nodes with GOBP k (/fo = l): J2i {'^ki ' ^^Sn{e^i)) / Hi hi, 
where sign{e\) is the sign of J],. e\^ and r is the index of biological repli- 
cates. The sign was multiplied to distinguish positive and negative acti- 
vation of GOBP k for node / at Furthermore, the interaction degree for 
two activated GOBPs k and / at t was estimated as the fraction of the 
inferred edges between the two sets of the nodes with activated GOBP k 
{a]^i= 1) and activated GOBP / (<3{-= 1), respectively, in the network at t 
among all possible edges among the two sets of nodes with activated 
GOBPs k and /: 

Yl S\ia[ia\j^ign {sign{e\) + sign [e]^ ) / 4. j a]^ 

The sum of the signs of the linked nodes / and j was included such that the 
edge with different signs of the Unked nodes should have no contribution 
to the interaction degree. 



2.2 Optimization of the likelihood function 

The joint probability of the graphical model at each / is defined by 
P{G\E,T,A\X'\X'-\X^)=P"'''"'P^'''''P'''P''P'^^. The output hidden 
variables (Xh = [X' , G\ A^]) are then estimated by maximizing the 
log-likelihood function of the joint distribution, log P{G\ E, T, 
A\X'\X'-\X^) using variational inference (Beal, 2003). Briefly, for 
each t, we first calculated the lower bound of the marginal log- 
likelihood function, log P{E, T\X'-\ X^), which can be obtained by 
integrating the joint probability with respect to Xk (Supplementary 
Information SI. 4): for distributions of X^, Q(Xh), 

\ogP{T, E\X'-\ X^) > /g(X/01og(^ ^^^^''gf]^"''^'^ )jX/,. We then used a 



3 A SYNTHETIC TDN MODEL 

To demonstrate the perfomance of TEMPI, we generated a 
template PPI network and GOBPs to simulate characteristics 
of real yeast PPI networks and then sampled the synthetic 
TDN model from the template PPI network for which temporal 
transitions of (i) network structures and (ii) differentially regu- 
lated GOBPs associated with the TDN model are known. First, 
to generate a template PPI network with the characteristics of 
real yeast PPI networks, we used a geometric graph model with 
gene duplication and mutation (GEO-GD expansion model; 
Przulj et al, 2010; Supplementary Fig. S3 A). See 
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Fig. 2. Identification of differentially regulated GOBPs and TDNs associated with the GOBPs. (A) Activation degree heat map representing temporal 
activation of 103 GOBPs. Differentially regulated GOBPs were categorized into Group 1 [Gl-GOBPs 60, 178, 94 (bold) and their descendants (non- 
bold)]. Group 2 (G2-GOBPS 67, 228, 146 and their descendants) and Group 3 (G3-GOBPs 218, 206, 246 and their descendants). (B) Interaction degree 
heat maps showing temporal associations among the GOBPs in Groups 1-3. Color bars, gradients of the activation (A) and interaction degrees (B). (C) 
Inferred TDNs for GOBPs 60, 67 and 218 in Group 1 (first, second and third rows, respectively). Colored nodes at each / represent the linked nodes with 
a[- = 1 for = 60, 67 or 218. Black nodes at / represent the linked nodes with a[- = 0, but tj^j^O for k = 60, 67 or 218, and gray nodes at t represent 
colored or black nodes in G or G^~^ . Blue and red lines at / show the edges among DRNs (g-J = 1) and the displacement of DRNs from t-l, respectively 



Supplementary Information S2. 1 for the detailed procedure. The 
resulting template PPI network included 26454 edges for the 
3258 nodes (Supplementary Fig. S4A). This network was also 
used as the input PPI network (G'^). Second, to generate GOBPs 
with the characteristics of real yeast PPI networks, we assigned 
306 GOBP labels (T) to the 3258 nodes in the template PPI 
network using a modified version of network module (NeMo; 
Rivera et ai, 2010; Supplementary Fig. S3B). See Supplementary 
Information S2.2 for the detailed procedure. Among the 306 
GOBPs, we used 103 after removing 203 GOBPs assigned to 
> 100 or <5 DRNs, which can be too general or non-meaningful, 
respectively, for functional interpretation. Third, we then 
sampled a TDN model from the template PPI network by select- 
ing the nodes with (i) GOBPs 60, 67 and 218; (ii) GOBPs 178, 
228 and 206; and (iii) GOBPs 94, 146 and 246 at individual time 
points based on predefined fractions of the linked edges among 
the selected nodes over time (Supplementary Figs S3C and S4B). 
See Supplementary Information S2.3 for the detailed procedure. 
Finally, we generated time-course gene expression log2-fold- 
changes that reflect temporal transitions of the synthetic TDN 
model using MetropoHs-Hastings algorithm (Hastings, 1970; 
Supplementary Figs S3D and S4C). See Supplementary 
Information S2.4 for the detailed procedure. 

4 RESULTS AND DISCUSSION 

4.1 Application of TEMPI to the synthetic data 

To evaluate performance of TEMPI, we apphed it to the syn- 
thetic input data (J, E, G^). As the input PPI data, we used the 
PPIs in the template PPI network (C^; see Section 3). We first 
applied MDS to G^ for the 3258 nodes to compute in 2D 
latent space. To identify TDNs, we then identified 616 DRNs 
with false discovery rates (FDRs) < 0. 1 using a modified version 



of repeated measure-analysis of variance (RM-ANOVA) test 
previously reported (ElBakry et al, 2012) and maximum log2- 
fold-changes>0.58 (1.5-fold) at least at one time point 
(Supplementary Information SI. 2) and used the for the 616 
DRNs as an input data. For the synthetic GOBP data, we used 
103 GOBPs (7) for the DRNs as described in Section 3. Finally, 
we used the 616x6x3 synthetic log2-fold-changes for the DRNs 
as the input expression data (£). After applying TEMPI to these 
synthetic input data {X^ , T, E) for the 616 DRNs, the output 
variables {X\ G\A^) were estimated by the optimization of the 
likehhood function of the probabihstic graphical model (see 
Section 2.2). Using the X' and G\ TDNs (TEMPI- Gf) were 
inferred at individual time points (see Section 2.3). 

For functional interpretation of TEMPI-G'^ we first examined 
temporal activation of the 103 GOBPs represented by TEMPI-G"' 
(Fig. 2 A) based on the activation degrees of the 103 GOBPs 
computed using the estimated (see Section 2.3). The activation 
degrees revealed that three groups of GOBPs (Groups 1-3 in Fig. 
2A, left panels), among the 103 GOBPs, were differentially regu- 
lated early, mid and late over time, respectively. Notably, 
Groups 1-3 included GOBPs 60, 178 and 94 (Group 1); 67, 
228 and 146 (Group 2); and 218, 206 and 246 (Group 3), respect- 
ively, consistent to the predefined differential regulation of the 
three sets of the GOBPs (Supplementary Information S2.3). For 
example, the high activation degree of Group 1 at ^ = 1 indicates 
that a large number of the nodes with Group 1 are linked at 
t = \. The decrease of the activation degree from t = 2 indicates 
that decreasing numbers of the nodes with Group 1 are linked 
from t = 2. Moreover, the descendants of the predefined GOBPs 
in Groups 1-3, respectively, were partially differentially regu- 
lated. This is expected because the nodes assigned with the des- 
cendant GOBPs also have their parent GOBPs (Supplementary 
Fig. S3). We then examined temporal associations among the 
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GOBPs in Groups 1-3 represented by TEMPI-G'^ based on the 
interaction degrees of the GOBPs computed using the estimated 
and (Supplementary Fig. 2B; see Section 2.3). The inter- 
action degrees revealed that (i) Group 1 and their descendants; 
(ii) Group 2 and their descendants; and (iii) Group 3 and their 
descendants showed early, mid and late associations among 
them, respectively, consistent to the predefined differential regu- 
lation of their parent GOBPs. 

Finally, TEMPI-G^ (Fig. 2C) showed the transitions of nodes 
and edges over time from the initial network (G^; Supplementary 
Fig. S4A). Of note, because of the geometric representation of 
the TDNs, the linked nodes at t in TEMPI-G'^ were moved clo- 
sely to each other. The TDNs (Fig. 2C) for GOBP60 (Group 1), 
GOBP67 (Group 2) and GOBP218 (Group 3) correctly captured 
early, middle and late transitions defined in their true TDNs 
(Supplementary Fig. S4C). 

4.2 Comparison of TEMPI with previous methods 

To quantitatively assess the relative performance of TEMPI, we 
applied the two methods, the simple method (Hwang et aL, 2009) 
and PNA (Kim et aL, 2011) described in Section 1, to the syn- 
thetic data and identified TDNs as follows. For the simple 
method, as the input data, we used the synthetic E and in 
the template PPI network. We first identified the DRNs as the 
nodes with the maximum fold-changes > a cutoff of 2 or 1.5. For 
the DRNs, we constructed a DN using the G^. We then estimated 
the TDNs (Simple-G*) by selecting the interacting DRNs with 
fold-changes > the cutoff in the DN at each t or either of its 
neighboring time points (^-1 or ^ + 1) to reflect significant 
smooth transitions and then by linking the selected interacting 
DRNs (Supplementary Information S3.1). For the PNA appli- 
cation, as the input data, we used the same synthetic E and G^. 
PNA identified three activation patterns (HI -3 in Supplementary 
Fig. S5A) and then generated the three PSs (PS 1-3 in 
Supplementary Fig. S5B) that describe interactions among the 
DRNs showing the three activation patterns based on the input 
PPIs. By selecting both DRNs and edges showing PSl-3 at each t 
and then combining them, we identified TDNs (PNA-G'') at ^ = 1 
-6 (Supplementary Information S3. 2). 

We then evaluated the performance measures (precisions, re- 
calls and Fl scores) by comparing the TDNs inferred by the 
three methods with the true TDNs and compared the perform- 
ance measures of TEMPI with those of these two methods. 
Moreover, in many species, known PPIs (G^) are incomplete 
(Beyer et aL, 2007). Unlike TEMPI, both the simple method 
and PNA predict no edges not included in G^. Their performance 
can thus depend on completeness of G^. Thus, we further exam- 
ined robustness in accuracy of the inferred TDNs against incom- 
pleteness in G^ by inferring the TDNs using the three methods as 
randomly removing 10-90% of the PPIs in the template PPI 
network (G'^). Furthermore, to understand how the capability 
of TEMPI to predict edges not in G^ can contribute to the per- 
formance, we also compared the performance of TEMPI after 
removing the edge prediction capability by fixing to 0 when 
gjj = 0 during the optimization of TEMPI. PNA resulted in the 
highest precisions, but the lowest recalls, indicating that PNA-G"' 
had less false positives (FPs), but more false negatives (FNs), 
compared with TEMPI and the simple method (Fig. 3A and 
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Fig. 3. Comparison of TEMPI with two previous methods. (A) Precision, 
recall and Fl score for TEMFl-G\ PNA-C^ and Simple-G' with up to 
90% removal of the input PPIs. The performance can vary depending on 
which PPIs are removed. Thus, we performed 10 times of random sam- 
plings. Data are represented as meaniSD from the samplings. (B-C) 
Distributions of correlations of temporal profiles of degrees (B) and CC 
(C) in the three s with those in the true G*. (D) Changes in the inter- 
activities of GOBPs 60, 67 and 218 in the three s and the true G* (Left) 
and correlations between the interactivities in the inferred G* s and the 
true G^ (Right). See the legend for the lines used for the three methods 



Supplementary Table SI). By contrast, the simple method had 
lower precisions (more FPs), but high recalls (less FNs), com- 
pared with TEMPI and PNA. Interestingly, precisions of the 
three methods are robust to the amount of the removed PPIs, 
indicating the robust sensitivity of the methods for identifying 
the true links against the removal of PPIs. On the other hand, 
recalls of the simple method, PNA and TEMPI with no edge 
prediction capabihty linearly decreased with the increase of the 
amount of the removed PPIs. However, importantly, recall of 
TEMPI was robust up to 60% removal of the input PPIs. Based 
on the overall performance measure, Fl score defined as the 
harmonic mean of precision and recall, PNA was the best or 
comparable with TEMPI when the amount of the removed 
PPIs is <30%. However, TEMPI outperformed the other meth- 
ods when the amount of the removed PPIs is >30% and showed 
the robust performance against the removal of the PPIs. Also, 
TEMPI with no edge prediction capability achieved the similar 
performance to PNA, and the robustness against the PPI re- 
moval disappeared. These data indicate that the edge prediction 
capabihty of TEMPI recovered the removed links in TEMPI-G'^ 
In addition, it is important to assess whether the three methods 
correctly estimate temporal transitions of topological properties 
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in the true TDNs. Thus, we next examined the performance of 
the three methods in estimating transitions of the topological 
properties. To this end, we first selected TDNs reconstructed 
by the three methods when percentage of removal in input 
PPIs is 30% because they achieved similar Fl -scores. Then, for 
the nodes in the estimated by each method, we computed (i) 
degree and (ii) clustering coefficient(CC) profiles over time. 
Then, we calculated correlations of the degree and clustering 
coefficient profiles with those of the true TDNs (Fig. 3B-C). 
The comparisons revealed that temporal transitions of the topo- 
logical properties in TEMPI-G^, compared with FNA-G'^ and 
Simple-G'^ agreed best with those in the true TDNs. Structural 
changes in the G* s should be linked to transitions of cellular 
functions represented by the s over time. Thus, we examined 
how well structural changes in the G* s estimated by the three 
methods were linked to differential regulation of GOBPs 60, 67 
and 218 represented in the true TDNs (Supplementary Fig. S6 
for the other predefined six GOBPs). For TEMPI, as described 
in Section 4.1, the activation degrees of GOBPs showed that 
structural changes in the G* s well-represented early, mid and 
late differential regulation of the three GOBPs in the true 
TDNs (Fig. 2A). As another subjective measure previously re- 
ported (Song et al, 2009), we further defined 'interactivity' for 
each GOBP as the average number of the edges in G* among the 
nodes with the GOBP. For the GOBPs 60, 67 and 218, temporal 
changes of the interactivities obtained from TEMPI-G'^ achieved 
the highest correlation with those of the true TDNs, compared 
with PNA-C' and Simple-C' (Fig. 3D). All these data together 
indicate that TEMPI-G'^ represents effectively the temporal tran- 
sitions in the true TDNs in the structural (Fig. 3 A), topological 
(Fig. 3B-C) and functional (Fig. 3D) aspects, compared with the 
two previous methods. 

4.3 Application of TEMPI to the cell cycle data 

Cell cycle is a representative time- varying cellular process. 
Although small-scale TDNs for several molecules involved in 
cell cycle have been studied, large-scale TDNs for a comprehen- 
sive set of molecules for cell cycle are still largely unknown. Thus, 
we obtained time-course gene expression data (GSE8799) col- 
lected during the cell cycle of wild- type yeasts (Orlando et al., 
2008). We first applied MDS to the high quality of known yeast 
PPI data (G^; Supplementary Information S4.1) to calculate in 
a 2D latent space for 3258 nodes. We selected the 755 DRNs 
with FDRs<0.1 using the modified RM ANOVA test and max- 
imum log2-fold-changes>0.58 (1.5-fold) at one time point at 
least. To identify TDNs, we then obtained the following input 
data for the 755 DRNs: (i) log2-fold-changes (E) at six time 
points with two replicates, (ii) positions (X^) in a 2D space and 
(iii) 664 GOBPs (T) assigned to the 755 nodes (Supplementary 
Information S4.1). 

TEMPI generated G^ to G^ over six time points (GO, Gl, S, 
G2, G2/M and M phases) during the cell cycle. To understand 
functional transition represented by TEMPI-C^ we first exam- 
ined activation degrees of the 664 GOBPs. Among them, we 
focused on the 14 cell cycle-related GOBPs with pulsed changes 
of activation degrees, a characteristic of the cell cycle (Fig. 4A; 
Supplementary Information S4.2). They can be categorized into 
the following three groups: Group 1 , early activated GOBPs with 



the peaks at GO to Gl phase (DNA-dependent DNA replication 
initiation, DNA-dependent DNA replication, DNA strand 
elongation involved in DNA replication, DNA integrity check- 
point and telomere maintenance via telomerase); Group 2, 
middle activated GOBPs with the peaks at S to G2 phase 
(post-replication repair, sister chromatid cohesion, DNA packa- 
ging, spindle organization, spindle checkpoint, regulation of 
G2/M transition of mitotic cell cycle and nuclear division); and 
Group 3, late activated GOBPs with the peaks at G2/M to M 
phase [cytokinetic cell separation (CCS) and organelle inherit- 
ance]. The activation kinetics of these 14 GOBPs was largely 
consistent with their known kinetics during the cell cycle 
(Orlando et aL, 2008; Simon et al., 2001; Spellman et al., 
1998). For example, the expression of genes involved in DNA 
replication and repair reached peaks in Gl or S phases (Spellman 
et al., 1998), consistent to the kinetics of the GOBPs of DNA 
replication and post-replication repair in Figure 4A. 

We then examined the interaction degrees of the GOBPs to 
understand time-dependent associations among GOBPs repre- 
sented by TEMPI-G'^ In this analysis, we focused on the follow- 
ing five GOBPs: GOBPs 1-2, DNA-dependent DNA replication 
(DR) and DNA integrity checkpoint (DIC) in Group 1; GOBPs 
3-4, spindle checkpoint (SC) and regulation of G2/M transition 
of mitotic cell cycle (G2M) in Group 2; and GOBPs 5, CCS in 
Group 3. The interaction degrees revealed the following tem- 
poral interplays between the five GOBPs (Fig. 4B): (i) DIC 
and DR in Groups 1 and 2, respectively, during GO, Gl and S 
phases (t= 1-3); (ii) DIC and SC, as well as DIC and G2M, in 
Group 1 at S phase (t = 3); (iii) SC and G2M in Group 1 at S and 
G2 phases (^ = 3^); and (iv) SC and CCS in Group 4 at G2 
phase (^ = 4). Some of these interplays have been previously re- 
ported. Noh et al. (2009) reported that downregulation of G2/M 
transition-related genes (e.g. PLKl and SURVIVIN) led to a 
defect in mitotic spindle, and Gardner et al. (1999) reported 
that two DIC-related genes, RAD53 and DUNl, are required 
for establishment and maintenance of G2/M arrest. TEMPI-C' 
showed temporal transition of the nodes with the five GOBPs 
and their edges (G2M and CCS in first row of Fig. 4C, SC and 
G2M in second row of Fig. 4C and Supplementary Fig. S7 for 
individual GOBPs), consistent to the transitions of the GOBPs in 
Figure 4 A and B. 

These findings can provide novel insights into the interplays 
among the GOBPs during the cell cycle. TEMPI-G^ at S phase 
(t = 3) showed the strongest associations among the GOBPs. To 
investigate novel insights represented by the associations, we 
built a subnetwork (Fig. 4D) including the nodes with the four 
GOBPs (DIC, DR, SC and G2M) strongly interacting at S phase 
(t = 3). To reduce the complexity, we focused on the RAD53 
subnetwork (Fig. 4D) including RAD 53, a key molecule in 
DIC and DR, and its 36 interactors involved in the four 
GOBPs. Of the 36 interactors, only three are included in the 
input PPIs (G^), while the others are predicted. To assess the 
reliability of the predicted interactors of RAD53, we obtained 
9850 interactions among the 755 DRNs from an independent 
PPI database, BioGrid (Stark et al, 2006), and confirmed that 
18 of the 33 predicted interactors were previously detected by 
high-throughput inter act ome analyses, supporting the validity of 
TEMPI in the edge prediction. For example, TEMPI estimated 
the interactions of RAD53 with the 23 nodes ('x' nodes in 
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Fig. 4. Application of TEMPI to the cell cycle data. (A) Activation degree heat map of the 14 cell cycle-related GOBPs: Group 1, cell cycle checkpoint- 
related GOBPs; Group 2, DNA replication-related GOBPs; Group 3, GOBPs related to the process followed by the DNA replication and Group 4, M 
phase -related GOBPs. The five GOBPs (DIG, SC, G2M, DR and CCS) are denoted by the indicated symbols. (B) Interaction degree heat maps of the 
five GOBPs. (C) TDNs showing temporal transition of the network structures for the DRNs with G2M and CCS (first row) and the DRNs with SC and 
G2M (second row). See Supplementary Figure S7 for TDNs for the DICs with CR. See the legend of Figure 2C for descriptions of shapes and colors of 
nodes and edges. (D) A detailed subnetwork including RAD53 interactors at S phase. Symbols in A were used to distinguish DRNs with the five GOBPs, 
and the symbols are transparent if a[- = 0 and solid otherwise 



Fig. 4D) involved in DR. Among them, RAD53-DBF4 and 
DUNl was included in G', while the other interactions are 
newly predicted. Of the 21 predicted interactors, 14 (e.g. CTF4, 
CDC45 and CDC7) are reported to interact with RAD53 ac- 
cording to BioGrid. Moreover, TEMPI newly predicted the 
seven interactors of RAD 5 3 involved in G2M ('*' nodes in 
Fig. 4D), where three of them (CLB2, CLB5 and IPLl) are re- 
ported to interact with RAD53 according to BioGrid. Similarly, 
TEMPI identified seven interactors of RAD53 involved in DIG 
(□' nodes in Fig. 4D). Two of them (DUNl and RAD9) were 
included in G', and the other five were reported to interact with 
RAD53 according to BioGrid. The molecules with newly pre- 
dicted interactions in the RAD 5 3 subnetwork are known to be 
involved in DIG, DR, SC or G2M, independently of RAD53- 
dependent regulation of the cell cycle at S phase. Thus, all these 
data indicate novel insights into potential roles of these molecules 
in the RAD53-dependent regulation of the cell cycle at S phase. 
Many subnetworks can be analyzed in the same way to generate 
hypotheses for novel mechanisms underlying dynamic regulation 
of cell cycle. 

We further compared the performance of TEMPI on the cell 
cycle data with that of PNA. PNA produced the 10 PSs. By 
combining them, we generated TDNs as described above 
(PNA-G's in Supplementary Fig. S8). PNA-G's were significantly 
sparse, compared with TEMPI-Cs (Supplementary Fig. S8), be- 
cause PNA used only the sparse real PPIs (1425 PPIs between 
755 nodes) in G^, whereas TEMPI predicted a significant number 
of novel edges among the DRNs (Supplementary Fig. S9A). As 
described above, we assessed the reliability of the predicted PPIs 



by examining how many of them are reported in the BioGrid 
database (Supplementary Fig. S9B-C). The fraction of the pre- 
dicted PPIs reported in BioGrid (0.254) was significantly larger 
than the random expectation (0.0346). Moreover, TEMPI-C' 
captured the larger number of cell cycle-related GOBPs 
(120 GOBPs) with pulsed activation patterns than PNA-C' 
(16 GOBPs), suggesting that TEMPI-G'^ more effectively cap- 
tured cell cycle-related functional transition than PNA-C' 
(Supplementary Fig. S9D-F). The comparison of the distribu- 
tion of degrees and CC also showed that TEMPI-C' is 
more dense and modular than PNA-G"' (Supplementary 
Fig. S9G-H). See Supplementary Information S4.3 for further 
details. Also, we applied TEMPI to the gene expression data 
collected from the mutant yeasts with defects in cell cycle, com- 
pared activation/interaction degrees of cell cycle-related GOBPs 
between wild-type and mutant yeasts, and examined deregulation 
of wild-type TEMPI-C' in mutant yeasts (Supplementary 
Information S4.4). 



4 CONCLUSIONS 

In this study, we developed TEMPI that effectively estimates 
TDNs associated with activated GOBPs over time by integrating 
time-course gene expression, PPI and GOBP data. TEMPI pro- 
vides activation and interaction degrees of GOBPs, facihtating 
the interpretation of temporal activations and interplays of 
GOBPs represented by the estimated TDNs. This interpretation 
leads to generation of TDN-driven hypotheses for key pathways 
regulating cellular events under investigation (see Section 4.3). 
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Thus, TEMPI can serve as a useful tool that provides hypotheses 
for the mechanisms underlying functional transitions in various 
problems in time- varying biological systems. See Supplementary 
Information S5 and S6 for implementation and limitations of 
TEMPI, respectively, which include potential limited applicabil- 
ity of TEMPI to other types of interactions than PPIs, sensitivity 
to completeness of the input PPIs and the scalability issue. 
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